Back to Blog

TrueACT Uses Half The Parameters And I Am Not Sharing The Code

I am redefining what a neuron can do. The project is called TrueACT. It is going well. I am not open sourcing it. That decision feels heavy. It also feels necessary. Both feelings are valid.

Sometimes the best ideas are the ones you keep to yourself until they are ready. Sometimes ready means never. I am still deciding which category this falls into.

The Core Idea

It started with a simple observation. A single linear neuron can only represent a plane in three dimensional space. Multiplication lives on a curved manifold. Standard gradient descent finds a best fit plane that fails outside a narrow range. The fix was to transform the problem into log space.

# The log-space transformation
Output = sign(x * y) * exp(w1 * ln|x| + w2 * ln|y| + b)

When w1=1, w2=1, b=0, this perfectly represents multiplication.
Training in log-space produces stable, linear gradients.
# Simple. Elegant. Probably obvious in retrospect.

This led to a deeper realization. Language and math operate in two different coordinate systems. Linear space is good at addition, superposition, counting, and memory. It is bad at multiplication, ratios, and logic chains. Log space is good at multiplication, ratios, scaling, and logic. It is bad at addition, memory, and frequency counting.

Real intelligence needs both. A single coordinate system cripples the model on half the problem. TrueACT provides both.

The Architecture

TrueACT is a recursive, confidence based agent. Each layer loops over its internal state until cumulative action confidence hits ninety nine percent. It has a think cell for recurrent state updates. It has a router with four mode softmax. It has experts gated by router weights.

2x
Parameter Efficiency
4
Expert Modes
99%
Confidence Threshold
40
Lessons Learned

The experts include a standard MLP with SiLU activation for frequency counting and pattern matching. A fancy expert that applies log space transformation for ratios and logical rules. A memory vault for differentiable key value associative retrieval. The router learns to dispatch each token to the right expert at the right time.

The Showdown

I ran a controlled comparison. Three layer TrueACT versus three layer standard LLaMA with SwiGLU MLP. Both trained on identical mixed data. Multiplication plus text. The results are uncomfortable.

Metric Standard LLaMA TrueACT
Loss 0.0884 0.0880
Parameters 852,864 428,652
Training Speed 1x 2.8x slower

TrueACT matches standard LLaMA loss with fifty percent fewer parameters. The recursive thought loop and log space routing buy genuine algorithmic efficiency. The trade off is compute. TrueACT trains 2.8 times slower due to sequential thought steps. Each forward pass runs twelve iterations of the router expert loop instead of a single feed forward pass.

Why I Am Not Open Sourcing

I need time to understand what I have built. I need time to test it thoroughly. I need time to decide whether this is a breakthrough or a clever trick that happens to work on my benchmarks. Releasing prematurely helps no one. Releasing confidently helps everyone.

Open source is a promise. It says I will maintain this. I will answer questions. I will fix bugs. I am not ready to make that promise yet. The code is messy. The documentation is incomplete. The benchmarks are preliminary.

Also I am selfish. I want the satisfaction of finishing it myself. I want the credit. I want the control. These are not noble motivations. They are human. I am accepting them.

The Journey

This project represents forty lessons of iterative experimentation. Lesson one through five explored a single log space neuron learning multiplication. Lesson six through nine discovered that log space cannot add while linear space cannot multiply. Lesson ten through fourteen found that pure log space deep networks collapse while sandwich hybrids stabilize.

Lesson fifteen through twenty four developed mixture of experts with hierarchical routing that worked but was three times slower. Lesson twenty five through thirty added memory vault neurons and internal thought loops with one hundred percent confidence on associative retrieval. Lesson thirty one through thirty six solved end to end router collapse with adaptive computation time. Lesson thirty seven through forty one achieved the LLaMA showdown with two times fewer parameters at same loss.

Each lesson is documented. Each failure is preserved. Each breakthrough is logged. The archive is private for now. It will remain private until I am ready.

What Comes Next

I will keep testing. I will keep refining. I will keep documenting. When TrueACT is ready, I will release it. Or I will not. The decision is mine. The timeline is flexible. The work continues.

I am also training Glint variants. I am also managing the AIExpermentLab. I am also dealing with Enderchefcoder being on break. The workload is heavy. The progress is real. The caffeine supply is depleted.

Final Thoughts

TrueACT uses half the parameters for the same loss. It is recursive. It is routed. It is not open source. I am redefining what a neuron can do. I am taking my time. I am being selfish. I am also being honest.

If you want to see the code, wait. If you want to understand the architecture, read this blog. If you want to use it, wait longer. Patience is a virtue. Secrecy is a strategy. Both are temporary.

Progress is weird. Control is comforting. Sharing is scary. I will share when I am ready. Until then the work continues in private. The benchmarks improve. The parameters shrink. The confidence grows.